System and method of operation for remotely operated vehicles with improved position estimation

ABSTRACT

The present invention provides a system and method of position estimation for remotely operated vehicles, even in noisy environments. In some embodiments, a position estimation engine includes a 2D projection module, a registration module, a position estimation module, and an efficiency module. The improved position estimation starts with a real frame from a video and a virtual image that is the projection of the 3D elements given the ROV's noisy position. Position estimation begins by projecting each of the visible structures individually and then registering them with the real image. The 2D transformation resulting from the registration process is then used to estimate the ROV's 3D position, and the per-structure position estimates are robustly combined. Because this position estimation needs to run in real time or near real time, an efficiency module improves the efficiency of the position estimation.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a U.S. National Stage Application of International Application No. PCT/IB2018/055976 filed Aug. 8, 2018, which designates the United States.

The disclosures of published patent documents referenced in this application are hereby incorporated in their entireties by reference into this application in order to more fully describe the state of the art to which this invention pertains.

The present invention relates to a system of operation for remotely operated vehicles (“ROV”), and methods for its use. In particular, the present invention provides a system and method of operation for ROVs with improved position estimation.

BACKGROUND OF THE INVENTION

Exploration of the last frontier on earth, the sea, is largely driven by the continuing demand for energy resources. Because humans are not able to endure the pressures induced at the depths at which energy reconnaissance occurs, we have become increasingly reliant upon technology such as autonomous vehicles and ROV technology. The future of the exploration of the oceans is only as fast, reliable and safe as the available technology. Thus, new innovations in exploration are needed.

SUMMARY OF THE INVENTION

The Oil and Gas (O&G) industry has subsea fields that need to be maintained. For security reasons, maintenance operations are performed by Remotely Operated Vehicles (ROVs). These ROVs can be used for construction, repair, routine operations, or visual inspection of the subsea structures.

However, visibility is typically poor in underwater environments, especially in the vicinity of the seafloor due to floating sediment. To minimize this issue, Augmented Reality (AR) can be used to superimpose 3D models of the subsea structures on the video feed. This way, even if the structures cannot be seen in the video due to poor visibility, a virtual representation of the structure is displayed in the correct position, helping the pilot to navigate properly.

The structures' appearance as well as their position is known and, therefore, a 3D model of the field can be built before the start of a subsea mission. Moreover, ROVs contain sensors that output their position and depth and, as such, can be positioned in the 3D scene. Therefore, it is possible to compute the 3D elements that should be visible in the video, as well as their position.

A problem with this approach is that the ROV's positional telemetry may be noisy, which may lead to a misalignment between the virtual and the real elements.

This disclosure provides systems and methods relating to the operation of ROVs with improved position estimation in noisy environments. Although embodiments and examples are provided in the context of undersea missions, one skilled in the art should appreciate that the aspects, features, functionalities, etc., discussed in this disclosure can also be extended to virtually any type of complex navigation project.

BRIEF DESCRIPTION OF THE DRAWINGS

The aforementioned and other aspects, features and advantages can be better understood from the following detailed description with reference to the accompanying drawings, wherein:

FIG. 1A shows a diagrammatic view of a system, according to some embodiments;

FIG. 1B shows a diagrammatic view of a system and its associated functions, according to some embodiments;

FIGS. 2A and 2B depict alternative views of a user interface of a system according to some embodiments;

FIGS. 3A and 3B show software architecture overviews of a system, according to some embodiments;

FIG. 3C is a diagrammatic illustration of networked systems, according to some embodiments;

FIG. 4 depicts modules for achieving hybrid 3D imagery, and a method for their use, according to some embodiments;

FIG. 5A illustrates calculations for aligning a virtual video and a real video, according to some embodiments;

FIG. 5B illustrates hybrid 3D imagery obtained by superimposing a virtual video and a real video, according to some embodiments;

FIGS. 6A-6E depict several views of a navigation interface, according to some embodiments;

FIG. 7 depicts a position estimation engine for achieving efficient position estimation, and a method for its use, according to some embodiments of the invention;

FIG. 8A depicts an example of a misalignment between the real and virtual images;

FIG. 8B depicts another example of a misalignment between the real and virtual images;

FIG. 9 illustrates an overlapped frame, a real frame, a first virtual image, and a second virtual image; and

FIG. 10 depicts an overview of a method of using the position estimation module to improve a position estimation.

DETAILED DESCRIPTION OF THE INVENTION

The invention provides a system for operating a remotely operated vehicle (ROV) comprising:

a) a database module of 3D elements operable to represent objects disposed in an operation environment of the ROV;

b) a virtual video generating module operable to generate a virtual video incorporating the 3D elements;

c) a video camera mounted to the ROV operable to capture a real video of the operation environment of the ROV;

d) a synchronizing module operable to synchronize an angle and position of a virtual camera with an angle and position of the video camera mounted to the ROV, wherein the virtual camera defines a field of view for the virtual video; and

e) a position estimation engine operable to align a virtual video element with a real video element to create hybrid 3D imagery and generate a ROV position estimation, the position estimation engine comprising: a projection module; a registration module; a position estimation module; and an efficiency module.

The systems and methods described herein may further have one or more of the following features, which may be combined with one another or any other feature described herein unless clearly mutually exclusive.

The projection module may be operable to project visible 3D structures into a corresponding 2D virtual image with a 2D virtual image field-of-view, and each of the visible 3D structures may be projected into its corresponding 2D virtual image.

The 2D virtual image field-of-view may be larger than the field of view for the virtual video.

The projection module may generate N virtual images.

The registration module may register the N virtual images with a real frame from the real video.

The registration module may determine edge maps to register the N virtual images.

The registration module may map at least one virtual edge point to at least one corresponding real edge point with a similarity transformation matrix.

The registration module may generate the similarity transformation matrix in a closed form using Umeyama's method.

The registration module may apply the similarity transformation matrix in an iterative process until the similarity transformation matrix is close to identity or exceeds a maximum number of iterations.

The position estimation module may generate the ROV position estimation based at least in part on the N virtual images of the projection module and the similarity transformation matrix of the registration module.

The position estimation module may determine a rigid body transformation Ti.

The position estimation module may generate a new ROV position estimation by right multiplying a previous ROV position estimation with Ti.

The position estimation module may render 3D structures from the new ROV position estimation.

Before rendering the 3D structures from the new ROV position estimation, the position estimation module may remove outlier position estimations to generate a set of remaining position estimations and determine an updated ROV position estimation comprising a mean of the set of remaining position estimations.

The efficiency module may use a full registration process for a specific structure (i) when a predetermined number of virtual edge points or real edge points are lost, (ii) after a predetermined number of frames, or (iii) when the specific structure enters the real video for a first time.

The invention also provides a method of operating a remotely operated vehicle (ROV) comprising:

a) obtaining 3D bathymetry data using multibeam sonar;

b) storing 3D elements in a database module, the 3D elements representing objects disposed in the ROV's operation environment and comprising the 3D bathymetry data;

c) generating a virtual video of the 3D elements;

d) synchronizing an angle and position of a virtual camera with an angle and position of a video camera mounted to the ROV, wherein the virtual camera defines a field of view for the virtual video; and

e) aligning a virtual video element with a real video element to create hybrid 3D imagery, wherein aligning comprises:

f) projecting visible 3D structures into a corresponding 2D virtual image with a 2D virtual image field-of-view;

g) generating N virtual images;

h) registering the N virtual images with a real frame from the real video; and

i) generating a ROV position estimate based at least in part on the corresponding 2D virtual image and the registered N virtual images.

A method may further comprise using a full registration process for a specific structure.

A method may further comprise:

a) mapping virtual edge points to corresponding real edge points;

b) applying an iterative process to the mapping; and

c) generating a ROV position estimation.

A method may further comprise rendering 3D structures from the ROV position estimation.

A method for generating an ROV position estimation may further comprise:

a) removing outlier position estimations;

b) generating a set of remaining position estimations; and

c) determining an updated ROV position estimation comprising a mean of the set of remaining position estimations.

The invention also provides a computer program product, stored on a computer-readable medium, for implementing any method according to the invention as described herein.

As mentioned supra, various features and functionalities are discussed herein by way of examples and embodiments in a context of ROV navigation for use in undersea exploration. In describing such examples and exemplary embodiments, specific terminology is employed for the sake of clarity. However, this disclosure is not intended to be limited to the examples and exemplary embodiments discussed herein, nor to the specific terminology utilized in such discussions, and it is to be understood that each specific element includes all technical equivalents that operate in a similar manner.

Definitions

The following terms are defined as follows:

-   3D elements; 3D objects—Data defining three-dimensional shapes, obtained by modeling sonar-derived input or user-determined input.
-   Abstraction; layer of abstraction—A characteristic of executable software, wherein differing data formats are standardized into a common format such that components are made compatible.
-   Data engine—A collection of modules, according to an embodiment of this invention, which is responsible for at least the acquisition, storing and reporting of data collected over the course of a ROV mission.
-   Fail state—A state, defined by a user or by a standard, wherein the functionality of the system, according to some embodiments of the invention, has decreased to an unacceptable level.
-   Luminance threshold—A system-determined value of RGB (Red, Green, Blue) pixel color intensity which defines a visible but transparent state for the images depicted by a digital image output device.
-   Module—A combination of at least one computer processor, computer memory and custom software that performs one or more defined functions.
-   Navigation engine—A collection of modules, according to some embodiments of this invention, which is responsible for making the Navigation Interface interactive, and for producing data for displaying on the Navigation Interface.
-   Positioned; geopositioned; tagged—Having a location defined by the Global Positioning System of satellites and/or acoustic or inertial positioning systems, and optionally having a location defined by a depth below sea level.
-   Position estimation engine—A collection of modules, according to some embodiments, which is responsible for position estimation.
-   ROV—A remotely operated vehicle; often an aquatic vehicle. Although for purposes of convenience and brevity ROVs are described herein, nothing herein is intended to be limiting to only vehicles that require remote operation. Autonomous vehicles and semi-autonomous vehicles are within the scope of this disclosure.
-   Visualization engine—A collection of modules, according to an embodiment of this invention, which is responsible for producing the displayed aspect of the navigation interface.

System

Hardware and Devices

Referring now to the drawings, wherein like reference numerals designate identical or corresponding parts throughout the several views, FIG. 1A diagrammatically depicts a system according to an embodiment of the invention. This system includes an ROV and its associated instrumentation 1, an operating system housed within computer hardware 3 and a user interface and its associated devices 2. The operating system 3 mediates interaction between the ROV 1 and the user 4, such that the user may submit commands and inquiries for information to the ROV 1, and obtain mechanical responses and data output from the ROV 1.

As seen from FIG. 1B, the operating system 3 may receive live information obtained by the ROV's 1 multibeam 3D real-time sonar, telemetry data, positioning data and video, as well as programmed 3D objects from a database 5, and process that data to provide live 3D models of the environment for both augmented reality and full 3D rendering displayed at the user interface 2. The user interface 2 may also be used to display video obtained using the ROV's 1 digital instrumentation, including, for example, cameras and other sensors. The ROV 1 utilized in the system of the present invention is equipped with conventional instrumentation for telemetry and positioning, which are responsive to the commands mediated by the operating system 3.

In one embodiment of the invention, the hardware for the operating system 3 includes a high-end rack computer that can be easily integrated with any ROV control system. The several software modules that further define the operating system will be described in further detail infra.

With reference to FIGS. 2A and 2B, the human-machine interface includes at least one monitor 7, and preferably three interactive monitors 7 for navigation. According to one embodiment shown in FIG. 2A, the center monitor 7 provides a video feed and augmented reality (AR), while the side monitors provide an expansion of the field of view of operation. In another aspect, the side monitors may allow the user to have a panoramic view of the ROV environment using full 3D visualization from the point of view of the ROV. As seen in FIG. 2B, the interaction between the user and the system may utilize joysticks 8, gamepads, or other controllers. In another embodiment, the user interface 2 may employ touch or multi-touch screen technology, audio warnings and sounds, voice commands, a computer mouse, etc.

Functional Modules

Rather than developing a different operating system 3 for each brand and model of ROV 1, the embodiments described herein work by abstraction, such that the disclosed operating system 3 and associated hardware work the same way with all ROVs 1. For example, if one component delivers “$DBS, 14.0, 10.3” as depth and heading coordinates, and another component delivers “$HD, 15.3, 16.4” as heading and depth coordinates, these data strings are parsed into their respective variables: Depth1=14.0, Heading1=10.3, Heading2=15.3, Depth2=16.4. This parsing allows both systems to work the same way, regardless of the data format details.
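As an illustration only, such parsing might look like the following minimal sketch in Python. The sentence tags and field order come from the example above; the function name and the normalized dictionary format are hypothetical.

```python
# A minimal sketch of the parsing abstraction described above. The sentence
# tags ("$DBS", "$HD") and field order follow the example in the text; the
# function name and output format are hypothetical.

def parse_telemetry(sentence: str) -> dict:
    """Normalize vendor-specific telemetry strings into a common format."""
    fields = [f.strip() for f in sentence.split(",")]
    tag = fields[0]
    if tag == "$DBS":   # depth first, then heading
        return {"depth": float(fields[1]), "heading": float(fields[2])}
    if tag == "$HD":    # heading first, then depth
        return {"heading": float(fields[1]), "depth": float(fields[2])}
    raise ValueError(f"unknown telemetry sentence: {tag}")

# Both devices now look identical to the rest of the system:
assert parse_telemetry("$DBS, 14.0, 10.3") == {"depth": 14.0, "heading": 10.3}
assert parse_telemetry("$HD, 15.3, 16.4") == {"heading": 15.3, "depth": 16.4}
```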

By developing a layer of abstraction of drivers for communication between the operating system 3 and the ROV hardware, the user 4 is provided with seamless data communication, and is not restricted to using particular ROV models. This abstraction further allows users 4 and systems 3 to communicate and network information between several systems, and share information among several undersea projects. The use of a single system also allows for cost reduction in training, maintenance and operation of this system.

FIG. 3A depicts a software architecture overview illustrating the component parts of the ROV 1, user interface 2 and operating system 3. Software counterparts are provided for the ROV's telemetry, positioning, video and sonar instrumentation. In order to implement user functions including planning, logging, navigation, supervision and debriefing, the operating system 3 provides a navigation engine, a visualization engine and a data engine. The operating system 3 is networked such that connected services and external command units can provide real-time data input. One such external command unit may be configured as a watchdog. The external watchdog system may perform periodic checks to determine whether the system is working properly, or is in a fail state. If the system is in a fail state, the watchdog may change the monitors' inputs, or bypass them, to a conventional live video feed until the system is operating correctly.

FIG. 3B depicts a further software architecture overview illustrating that the operating system 3, which mediates the aforementioned user functions, is networked to provide communication between a multi-touch supervision console and a pilot or pilots. FIG. 3C illustrates yet another level of connectivity, wherein the navigation system of a first ROV may share all of its dynamic data with the navigation system of another ROV over a network.

Visualization Engine

As seen from FIGS. 1B and 3A, the operating system's 3 visualization engine further includes modules for implementing 3D imagery, two-dimensional (“2D”) imagery, and providing a real-time environment update. These modules are shown in FIG. 4, which illustrates in a stepwise fashion how the system operates in some embodiments to create superimposed hybrid 3D imagery.

A 3D database module 10 includes advanced 3D rendering technology to allow all the stages of ROV operation to be executed with reference to a visually re-created 3D deep-water environment. This environment is composed of the seabed bathymetry and modeled equipment, e.g., structures of ocean energy devices.

As discussed above, the main sources of image data may be pre-recorded 3D modeling of sonar data (i.e., computer-generated 3D video) and possibly other video data; live sonar data obtained in real time; video data obtained in real time; user-determined 3D elements; and textual or graphical communications intended to be displayed on the user interface screen. The geographical position and depth (or height) of any elements or regions included in the image data are known by GPS positioning, by use of acoustic and/or inertial positioning systems, by reference to maps, and/or by other sensor measurements.

In some embodiments, a virtual video generation module 11 is provided for using the aforementioned stored 3D elements or real-time detected 3D elements to create a virtual video of such 3D elements. The virtual video generation module 11 may work in concert with a synchronization module 12.

The synchronization module 12 aligns the position of the virtual camera of the virtual video with the angle and position of a real camera on an ROV. According to some embodiments, the virtual camera defines a field of view for the virtual video, which may extend, for example, between 45 and 144 degrees from a central point of view.

As illustrated in FIG. 5A, the alignment of virtual and real camera angles may be accomplished by calculating the angle between the heading of the ROV and the direction of the camera field of view; calculating the angle between the vertical of the ROV and the direction of the camera field of view; and calculating the angle between the ROV and the geographic horizon. These calculated angles are then used to determine an equivalent object screen coordinate of the digital X-Y axis at determined time intervals or anytime a variable changes value.
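As an illustration of the final step only, a minimal sketch of mapping angular offsets to a screen coordinate is given below. It assumes a simple pinhole model in which the focal length in pixels is derived from the camera's horizontal field of view; the patent does not prescribe a specific formula, and all names and values here are hypothetical.

```python
# A minimal sketch, under assumed pinhole conventions, of converting angular
# offsets from the camera's optical axis into pixel coordinates.
import math

def angles_to_screen(yaw_deg: float, pitch_deg: float,
                     width: int, height: int, hfov_deg: float) -> tuple:
    """Convert angular offsets from the camera axis to pixel coordinates."""
    # Focal length in pixels from the horizontal field of view.
    f = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    x = width / 2.0 + f * math.tan(math.radians(yaw_deg))    # right of center
    y = height / 2.0 - f * math.tan(math.radians(pitch_deg))  # above center
    return x, y

# An object 10 degrees right of and 5 degrees above the camera axis:
print(angles_to_screen(10.0, 5.0, width=1920, height=1080, hfov_deg=90.0))
```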

A superimposition module 13, whose function is additionally diagrammed in FIG. 5B, is provided for superimposing the generated virtual video 20 and the synchronized, real-time video 21 acquired by the ROV's digital camera. The result is hybrid superimposed 3D imagery 22, wherein the system effectively draws the generated 3D environment on top of the non-visible part of the video feed, thus greatly enhancing visibility for the ROV pilot. More specifically, the superimposition software divides the camera-feed video and the generated 3D video into several layers on the z-buffer of the 3D rendering system. This permits the flattening of the layers and their superimposition, which simulates spatial perception and facilitates navigation.

Yet another feature of the superimposition module 13 is that either one or both of the virtual 20 or real videos 21 may be manipulated, based upon a luminance threshold, to be more transparent in areas of lesser interest, thus allowing the corresponding area of the other video feed to show through. According to some embodiments, luminance in the Red-Green-Blue hexadecimal format may be between 0-0-0 and 255-255-255, and preferably between 0-0-0 and 40-40-40. Areas of lesser interest may be selected by a system default, or by the user. The color intensity of images in areas of lesser interest is set at the luminance threshold, and the corresponding region of the other video is set at normal luminance. For the example shown in FIG. 5B, the background of the virtual video 20 is kept relatively more transparent than the foreground. Thus, when the real video 21 is superimposed on the virtual 3D image 20, the real video 21 is selectively augmented primarily with the virtual foreground, which contains a subsea structure of interest.
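A minimal sketch of this luminance-threshold compositing is shown below, assuming both feeds are available as equally sized 8-bit RGB arrays. How the system actually blends layers on the z-buffer is not specified at this level of detail, so the per-pixel intensity measure and the default threshold of 40 are assumptions taken from the preferred range above.

```python
# A minimal sketch of luminance-threshold compositing. Pixels of the virtual
# frame whose intensity falls below the threshold are treated as "areas of
# lesser interest" and let the real video show through. NumPy is assumed.
import numpy as np

def superimpose(virtual: np.ndarray, real: np.ndarray,
                threshold: int = 40) -> np.ndarray:
    """Overlay the virtual frame on the real frame, keeping the real video
    visible wherever the virtual frame is darker than the luminance threshold."""
    luminance = virtual.max(axis=2)    # per-pixel intensity (assumed measure)
    mask = luminance >= threshold      # True where the virtual frame is "of interest"
    out = real.copy()
    out[mask] = virtual[mask]          # virtual foreground wins; background shows real
    return out
```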

Navigation Engine

The on-screen, 2D Navigation Interface for the ROV pilot involves superimposing geopositioned data or technical information on a 2D rendering system. Geopositioning or geo-tagging of data and elements is executed by reference to maps or to global positioning satellites. The resulting Navigation Interface, as seen in FIGS. 6A-6D, is reminiscent of aviation-type heads-up display consoles. In the case of subsea navigation, the display is configured to indicate ROV 1 position based on known coordinates, and by using a sonar system that records 3D images from a ROV's position for later navigation. In this way, the embodiments described herein provide immersive visualization of the ROV's operation.

FIG. 6A illustrates the superposition of textual information and symbols 30 onto the 2D video rendering of the ROV user interface. FIG. 6B illustrates the superposition of 3D elements 31 onto the video rendering. The superposition of these data onto the video feed is useful, not only for navigating and controlling the ROV 1, but also for executing the related planning and supervising functions of the operating system 3. This superposition may be accomplished in a similar way to the superimposition of the video feeds, i.e., by obtaining screen coordinates of an object, and rendering text and numbers near those coordinates.

The planning module enables engineers and/or supervisors to plan one or several ROV missions. Referring again to FIG. 6A, an important feature of the planning module is the input and presentation of bathymetry information 32 through 3D visualization. As seen on the Navigation Interface, waypoints 33 and checkpoints 34 are superimposed onto the video feed. These elements may be identified, for example, by number, and/or by distance from a reference point. In other words, in addition to superimposing the technical specifications and status information 30 for the ROV 1 or other relevant structures, the Navigation Interface also provides GPS-determined positions for navigation and pilot information.

In some embodiments, procedures 35, including timed procedures (fixed position observation tasks, for example), may be included on the Navigation Interface as text. Given this procedural information, a ROV pilot is enabled to anticipate and complete tasks more accurately. A user may also use the system to define actionable areas. Actionable areas are geopositioned areas in the undersea environment that trigger a system action when entering, leaving, or staying longer than a designated time. The triggered action could be an alarm, notification, procedure change, task change, etc.

Referring to FIG. 6C, using a series of rules established in the planning module, or by manual input, the system may show more or less 2D geo-tagged information on the Navigation Interface. For example, as seen at 36, during a ROV operation when the pilot is at 100 meters from a geo-tagged object, the system may show only general information relating to the overall structure, or specific information needed for a specific current task in the nearby area. As the pilot approaches the geo-tagged structure, shown at 37, the system may incrementally show more information about components of that structure. This dynamic and manual level of detail control may apply to both textual and symbolic information 30, as well as to the augmentation of 3D elements 31.

With reference to FIG. 6D, the planning module may also provide on-screen information relating to flight path 38. As seen in FIG. 6E, another important feature of the invention is embodied by a minimap 39, i.e., a graphic superimposed on the video, which may include a variety of different representations, such as small icons representing target objects. The minimap 39 may show the cardinal points (North, South, East, West) in a 3D representation, optionally in addition to a representation of a relevant object in tridimensional space. The minimap 39 may be positioned in a corner, and may be moved, dismissed and recalled by the user.

Data Engine

The data engine, which mediates the data warehousing and data transfer functions of the invention, incorporates the logging and supervising modules.

The logging module logs or records all information made available by the operating system and saves such data in a central database for future access. The available information may include any or all telemetry, sonar data, 3D models, bathymetry, waypoints, checkpoints, alarms or malfunctions, procedures, operations, and navigation records such as flight path information, positioning and inertial data, etc.

An essential part of any offshore operation is providing critical data to the client after the operation is concluded. After the operation, during the debriefing and reporting stage, the debriefing and reporting module may provide a full 3D scenario or reproduction of the operation. The debriefing and reporting module may provide a report on the planned flight path versus the actual flight path, waypoints, checkpoints, any deviations from the plan, alarms given by the ROV, including details of alarm type, time and location, procedures, checkpoints, etc., ready to be delivered to the client. Accordingly, the operating system is configured to provide four-dimensional (three spatial dimensions plus time) interactive reports for every operation. This enables fast analysis and a comprehensive understanding of operations.

Yet another software element that interacts with the Navigation Interface is the supervisor module. Execution of the supervisor module enables one or more supervisors to view and/or utilize the Navigation Interface, and by extension, any ROV 1 being controlled from the interface. These supervisors need not share the location of the ROV pilot or pilots, but rather may employ the connectivity elements depicted in FIGS. 3B and 3C. A plurality of multi-touch supervision consoles may be used at different locations. For example, one could have nine monitors connected to three exemplary hardware structures, including an ROV 1, where only one operating system 3 gathered the ROV data and shared information with the others. Alternatively, between one and 12 networked monitors may be used, and preferably between 3 and 9 may be used. Networking provided as shown in FIGS. 3B and 3C may reduce risks, such as human error, in multiple-ROV operations, even those coordinated from separate vessels. Networking through the supervisor module allows for the sharing of information between ROV systems, personnel and operations across the entire operation workflow.

Position Estimation Engine

As discussed herein with respect to FIGS. 1B and 3A, the operating system's 3 visualization engine further includes modules for implementing 3D imagery, implementing 2D imagery, and providing a real-time environment update. These modules are shown in FIG. 4, which illustrates how the system operates in some embodiments to create superimposed hybrid 3D imagery using a 3D database module 10, a virtual video generation module 11, a synchronization module 12, and a superimposition module 13.

According to some embodiments, yet another feature of the operating system 3 is the position estimation engine, which more efficiently aligns the virtual elements with the real elements while, at the same time, obtaining a more robust and efficient estimate of the ROV's position. In some embodiments, the position estimation engine works with the virtual video generation module 11 and the synchronization module 12 (e.g., using the generated virtual video 20 and the synchronized, real-time video 21 acquired by the ROV's digital camera). In some embodiments, the position estimation engine may update the bearing, heading, and ROV depth values (e.g., as discussed herein with respect to FIG. 5A). This is further described and shown with respect to FIG. 7, which illustrates how the system operates in some embodiments to efficiently improve position estimation.

In some embodiments, as shown in FIG. 7, a position estimation engine 70 is illustrated that includes a 2D projection module 71, a registration module 72, a position estimation module 73, and an efficiency module 74. The improved position estimation starts with a real frame from the video and a virtual image that is the projection of the 3D elements given the ROV's noisy position. Position estimation begins by projecting each of the visible structures individually and then registering them with the real image. The 2D transformation resulting from the registration process is then used to estimate the ROV's 3D position, and the per-structure position estimates are robustly combined. Because this position estimation needs to run in real time or near real time, an efficiency module 74 is used to improve the efficiency of the position estimation.

A positional sensor (such as the sensor providing live information obtained by the ROV's 1 multibeam 3D real-time sonar, telemetry data, positioning data and/or video as shown and described with respect to FIG. 1B), despite being noisy, provides a good first estimate of the ROV's position. In some embodiments, it is assumed that the projection of the 3D structures in 2D (the virtual image) is close to reality and, therefore, the system can search in the vicinity of the 3D structures for the real structures in the video. Then, the system can register each of the visible 3D structures with the real video independently. This registration can be performed at each frame of the video. However, in some embodiments, the registration may be prohibitively computationally expensive for a real-time scenario, unless efficiency is improved as discussed below with respect to the efficiency module 74.

A 2D projection module 71 projects each of the N visible 3D structures into N 2D virtual images. However, problems may arise if 3D structures are not fully visible in the virtual image. As examples, there are at least three situations where the 3D structures are not fully visible in the virtual image: 1) the real structure is, in fact, not fully visible in the real image; 2) the ROV's positional noise incorrectly causes the projection of the structure to lie outside the image; and 3) both of the above. The second and third situations are described further herein with respect to FIGS. 8A and 8B. If important points are incorrectly projected outside the image, the registration process (e.g., the registration process described herein with respect to registration module 72) may be negatively impacted.

FIGS. 8A and 8B depict examples of a misalignment between the real and virtual images. FIG. 8A depicts a 2D virtual image 80a containing a real structure 81a and a 3D structure projection 82a with a section 83 that is not visible in the virtual image 80a. Section 83 is the part of the 3D structure projection 82a that is not visible because it is projected outside the image (e.g., because the ROV's positional noise incorrectly causes the projection to lie outside the real structure). The parts of the 3D structure projection 82a that are projected inside the real structure 81a may be insufficient to determine if the scale is correct. FIG. 8B depicts a 2D virtual image 80b containing a real structure 81b with a section 85 that is not visible in the virtual image 80b and a 3D structure projection 82b with a section 84 that is not visible in the virtual image 80b. Thus, in FIG. 8B, both the real and virtual objects lie partially outside the field-of-view of the virtual image 80b.

To mitigate problems where 3D structures are not fully visible in the virtual image, the field-of-view of the virtual image may be increased. Therefore, points that would not be included using the same field-of-view as the real camera may be included. A visual example of this process is depicted in FIG. 9.

FIG. 9 illustrates a real frame 90 with a field-of-view 90a, containing a first real structure 91 with a section 92 that would not be visible in the field-of-view 90a of real frame 90, and a second real structure 93; a first virtual image 94 with a field-of-view 94a, containing a first 3D structure 95 with a section that would not be visible in the field-of-view 90a of real frame 90; a second virtual image 96 with a field-of-view 96a, containing a second 3D structure 97; and a third virtual image 98 with a field-of-view 98a. As shown in FIG. 9, to mitigate problems where 3D structures are not fully visible in the virtual image, virtual objects (e.g., the first 3D structure 95 and the second 3D structure 97) are projected into a virtual image (e.g., the third virtual image 98) with a larger field of view than the real frame. Thus, parts of the objects that would be left out of the image (e.g., section 92), for example due to noise in the ROV's sensors, are included in the registration process.
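As an illustration of this enlarged field of view, the following sketch projects camera-frame 3D points through a pinhole intrinsic matrix built from a chosen field of view. The resolutions, angles, and function names are assumptions for illustration; the disclosure specifies only that the virtual field of view is larger than the real one.

```python
# A minimal sketch of projecting 3D structure points into a virtual image
# whose field of view is wider than the real camera's, as FIG. 9 illustrates.
# A pinhole camera with square pixels is assumed.
import numpy as np

def intrinsics(width: int, height: int, hfov_deg: float) -> np.ndarray:
    """Build a 3x3 pinhole intrinsic matrix from a horizontal field of view."""
    f = (width / 2.0) / np.tan(np.radians(hfov_deg) / 2.0)
    return np.array([[f, 0.0, width / 2.0],
                     [0.0, f, height / 2.0],
                     [0.0, 0.0, 1.0]])

def project(points_cam: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Project Nx3 camera-frame points to Nx2 pixel coordinates."""
    uvw = (K @ points_cam.T).T
    return uvw[:, :2] / uvw[:, 2:3]

# The virtual camera uses a wider field of view than the real one, so points
# that noise pushes outside the real frame still land inside the virtual image:
K_real = intrinsics(1920, 1080, hfov_deg=60.0)
K_virtual = intrinsics(1920, 1080, hfov_deg=90.0)
point = np.array([[1.3, 0.0, 2.0]])  # outside the 60-degree frame, inside the 90-degree one
print(project(point, K_real), project(point, K_virtual))
```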

After the 2D projection module 71 projects into 2D, the registration module 72 has N virtual images to register with the real frame. In some embodiments, the registration module 72 starts by extracting edges from both the real and virtual images. The registration module 72 may use different edge detection methods. In some embodiments, for example where it is important to perform in real-time or near real-time, the registration module 72 may extract horizontal and vertical gradients from the real and virtual images. Then, the results may be thresholded to obtain two binary edge maps.
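A minimal sketch of this edge-map step is given below, using plain finite differences and an assumed threshold; any gradient operator (e.g., Sobel) would serve equally well.

```python
# A minimal sketch of the gradient-and-threshold edge extraction described
# above. The gradient operator and the threshold value are assumptions.
import numpy as np

def binary_edge_map(gray: np.ndarray, threshold: float = 30.0) -> np.ndarray:
    """Return a boolean edge map from horizontal and vertical gradients."""
    img = gray.astype(float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:] = np.diff(img, axis=1)   # horizontal gradient
    gy[1:, :] = np.diff(img, axis=0)   # vertical gradient
    magnitude = np.hypot(gx, gy)       # gradient magnitude per pixel
    return magnitude > threshold       # binary edge map
```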

The registration module 72 may use the binary edge maps to register the images. In some embodiments, the registration module 72 may use a variation of the Iterative Closest Point (ICP) algorithm (e.g., due in part to the assumption that the two images will be almost correctly aligned) to account for differences in scale. For each 2D point in the virtual edge map, the registration module 72 finds the closest point in the real edge map. Then, the registration module 72 finds the similarity transformation matrix S_i that maps the virtual edge points to the corresponding real edge points. In some embodiments, the registration module 72 obtains the similarity transformation matrix S_i in a closed form using Umeyama's method. The registration module 72 applies the transformation matrix to the points in the virtual edge map and, as in the ICP method, repeats the process until the transformation matrix is close to the identity or it exceeds the maximum number of iterations.
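The following sketch illustrates this ICP-style loop with a closed-form 2D Umeyama estimate at each iteration. The nearest-neighbor search structure, convergence tolerance, and iteration cap are implementation assumptions, not details fixed by the disclosure.

```python
# A minimal sketch of the registration loop: at each step, every virtual edge
# point is matched to its nearest real edge point, and a 2D similarity
# transform (scale, rotation, translation) is estimated in closed form with
# Umeyama's method. SciPy's KD-tree is used for nearest neighbors.
import numpy as np
from scipy.spatial import cKDTree

def umeyama_2d(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Closed-form 2D similarity transform mapping src onto dst, Umeyama (1991)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)              # cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])  # reflection guard
    R = U @ S @ Vt
    scale = np.trace(np.diag(D) @ S) / src_c.var(axis=0).sum()
    t = mu_d - scale * R @ mu_s
    T = np.eye(3)                                  # homogeneous 3x3 matrix
    T[:2, :2] = scale * R
    T[:2, 2] = t
    return T

def register(virtual_pts: np.ndarray, real_pts: np.ndarray,
             max_iter: int = 20, tol: float = 1e-3) -> np.ndarray:
    """Iteratively align virtual edge points to real edge points (ICP variant)."""
    tree = cKDTree(real_pts)
    S_total = np.eye(3)
    pts = virtual_pts.copy()
    for _ in range(max_iter):
        _, idx = tree.query(pts)                   # closest real edge points
        S = umeyama_2d(pts, real_pts[idx])
        pts = (S[:2, :2] @ pts.T).T + S[:2, 2]     # move the virtual points
        S_total = S @ S_total
        if np.linalg.norm(S - np.eye(3)) < tol:    # close to identity: converged
            break
    return S_total
```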

In some embodiments, for each structure i, represented by its 2D points x_i = [x_i1, . . . , x_iK], the registration module 72 applies the similarity transformation matrix S_i to the set of the structure's 2D points x_i, resulting in the new position x′_i:

x′_i = S_i · x_i.

In some embodiments, a similarity transformation matrix is used instead of a rigid body transformation because the virtual objects might be projected at a different distance from the camera than in reality. Because closer objects look bigger in an image, sizes may need to be increased or decreased to achieve a proper alignment.

After the registration module 72 produces a matrix (e.g., the similarity transformation matrix S_i) for each of the N visible 3D structures, this transformation registers the N virtual images with the real frame. However, the new projected 2D location of the structures might be inconsistent with the model of the world, for example due to noise in the registration process. In other words, if these structures were back-projected into 3D, their position can potentially be incorrect. Moreover, in some embodiments, the field structures are static and only the ROV moves. Thus, a position estimation module 73 may use these 2D similarity transformations to update the ROV's 3D position, thus reducing the ROV sensor's positional noise. If P is the known 3×4 projection matrix, and X_i is the 3D homogeneous coordinates of structure i, then:

x_i = P · X_i,  x′_i = S_i · x_i.

Therefore, because P is the product of the camera's intrinsic parameters and the camera's position which, in this case, is the ROV's position, the position estimation module 73 may determine a rigid body transformation T_i such that:

x′_i = P · T_i · X_i.

The position estimation module 73 may determine T_i in closed form by back-projecting x′_i into 3D:

P⁺ · x′_i = T_i · X_i,

where P⁺ is the pseudo-inverse of P.

The position estimation module 73 may update the ROV's position by right-multiplying its previous position with T_i. However, a problem may arise in that the position estimation module 73 ends up with N different position estimates for the ROV. Thus, the position estimation module 73 may remove outlier positions and set the new ROV position to be the mean of the remaining estimated positions. After this process, the position estimation module 73 may render the 3D structures from the new ROV position.
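A minimal sketch of the update and robust combination steps is shown below. The outlier rule (distance from the per-axis median) and its threshold are assumptions; the disclosure states only that outlier positions are removed and the remaining estimates averaged.

```python
# A minimal sketch of the per-structure pose update and the robust combination
# of the N candidate positions. Poses are 4x4 homogeneous matrices.
import numpy as np

def candidate_position(prev_pose: np.ndarray, T_i: np.ndarray) -> np.ndarray:
    """Right-multiply the previous ROV pose with the rigid transform T_i."""
    return prev_pose @ T_i

def combine_positions(candidates: list, max_dev: float = 2.0) -> np.ndarray:
    """Drop outlier position estimates, return the mean of the inliers."""
    positions = np.array([pose[:3, 3] for pose in candidates])  # translation parts
    median = np.median(positions, axis=0)
    dist = np.linalg.norm(positions - median, axis=1)
    inliers = positions[dist < max_dev]   # e.g., within 2 m of the median (assumed rule)
    return inliers.mean(axis=0)
```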

FIG. 10 depicts an overview of a method of using the position estimation module to improve a position estimation. FIG. 10 illustrates a method 100 for improving (e.g., by denoising) a position estimation by aligning the 3D scene with reality. Method 100 includes block 101, where images are projected onto 2D (e.g., as described herein with respect to projection module 71 and with respect to FIGS. 7, 8A, 8B, and 9). Method 100 includes blocks 102 and 102a where, for each structure, the images are registered (e.g., as described herein with respect to registration module 72). Method 100 includes block 102b where, for each structure, an improved position estimate is determined (e.g., as described herein with respect to position estimation module 73). Method 100 includes block 103, where the outlier positions are removed, and block 104, where the position estimation is set to be the mean of the inlier positions.

In some embodiments, an efficiency module 74 may be used. The most computationally expensive part of the disclosed embodiments is the registration module 72 and its usage of an iterative process to find point correspondences. The efficiency module 74 may further improve efficiency by saving the point matches used for the registration and then tracking them in the following real and virtual frames. Because the tracking may deteriorate with time, the efficiency module 74 may use a full registration process for a given structure in various circumstances. For example, the efficiency module 74 may use the full registration process for a given structure (i) when a sufficient or predetermined number of points being tracked are lost on either the virtual or real frame, (ii) after k frames, or (iii) when the structure enters the image for the first time. The efficiency module 74 may keep track of point correspondences between x′_i and x_i. When these point correspondences are not available, a full registration process may include extracting features (e.g., edge features) and using the ICP method to register the real and virtual structures. In some embodiments, the feature extraction and ICP method may be computed at block 102a. This is a technical solution that increases efficiency because these are the most time-consuming tasks in the process. By using the efficiency module 74, these point correspondences do not need to be recomputed at every frame.
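As an illustration, this refresh policy could be tracked per structure as in the following sketch. The class, its method names, and the thresholds (minimum tracked points, refresh interval k) are assumptions; the disclosure fixes only the three trigger conditions.

```python
# A minimal sketch of the per-structure refresh policy of the efficiency
# module: fall back to the full registration when too many tracked points are
# lost, every k frames, or when the structure first enters the image.
class StructureTracker:
    def __init__(self, refresh_every_k: int = 30, min_tracked: int = 20):
        self.refresh_every_k = refresh_every_k
        self.min_tracked = min_tracked
        self.frames_since_full = None   # None until the structure is first seen
        self.tracked_points = 0

    def needs_full_registration(self) -> bool:
        return (self.frames_since_full is None                    # (iii) first time on screen
                or self.tracked_points < self.min_tracked         # (i) too many points lost
                or self.frames_since_full >= self.refresh_every_k)  # (ii) after k frames

    def after_full_registration(self, n_matches: int) -> None:
        """Record a fresh set of point matches from the full process."""
        self.frames_since_full = 0
        self.tracked_points = n_matches

    def after_tracked_frame(self, n_surviving: int) -> None:
        """Record how many saved matches survived tracking into this frame."""
        self.frames_since_full += 1
        self.tracked_points = n_surviving
```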

In some embodiments, the full registration process may be too slow to run at the desired speed (e.g., greater than 20 times per second). This means the estimations may come either too late to be useful or may need adaptation. The efficient registration process runs much faster but uses data provided by the full registration process and may need to refresh the full process at some point. Consequently, in some embodiments, the efficiency module 74 may initially use the full registration process for a period of time (e.g., up to a few seconds) and then use the efficient process to provide an accurate position estimate per frame (or very close to per frame). This may be done on a per-structure basis. As an example only, if three structures were on screen, method 100 could run the efficient process for two of the structures—and estimate the ROV position from those two structures—while the third structure is still being initialized.

Thus, there has been shown and described a system and method relating to improved position estimation of ROVs. The method and system are not limited to any particular hardware or software configuration. The many variations, modifications and alternative applications of the invention that would be apparent to those skilled in the art, and that do not depart from the scope of the invention, are deemed to be covered by the invention.

What is claimed is:
1. A system for operating a remotely operated vehicle (ROV) comprising: a database module of 3D elements operable to represent objects disposed in an operation environment of the ROV; a virtual video generating module operable to generate a virtual video incorporating the 3D elements; a video camera mounted to the ROV operable to capture a real video of the operation environment of the ROV; a synchronizing module operable to synchronize an angle and position of a virtual camera with an angle and position of the video camera mounted to the ROV, wherein the virtual camera defines a field of view for the virtual video; and a position estimation engine operable to align a virtual video element with a real video element to create hybrid 3D imagery and generate a ROV position estimation, the position estimation engine comprising: a projection module; a registration module; a position estimation module; and an efficiency module.
2. The system of claim 1, wherein the projection module is operable to project visible 3D structures into a corresponding 2D virtual image with a 2D virtual image field-of-view, and wherein each of the visible 3D structures is projected into its corresponding 2D virtual image.
3. The system of claim 2, wherein the 2D virtual image field-of-view is larger than the field of view for the virtual video.
4. The system of claim 3, wherein the projection module generates N virtual images.
5. The system of claim 4, wherein the registration module registers the N virtual images with N real frames from the real video.
6. The system of claim 5, wherein the registration module determines edge maps to register the N virtual images.
7. The system of claim 6, wherein the registration module maps at least one virtual edge point to at least one corresponding real edge point with a similarity transformation matrix.
8. The system of claim 7, wherein the registration module generates the similarity transformation matrix in a closed form using Umeyama's method.
9. The system of claim 8, wherein the registration module applies the similarity transformation matrix in an iterative process until the similarity transformation matrix is close to identity or exceeds a maximum number of iterations.
10. The system of claim 9, wherein the position estimation module generates the ROV position estimation based at least in part on the N virtual images of the projection module and the similarity transformation matrix of the registration module.
11. The system of claim 10, wherein the position estimation module determines a rigid body transformation Ti.
12. The system of claim 11, wherein the position estimation module generates a new ROV position estimation by right multiplying a previous ROV position estimation with Ti.
13. The system of claim 12, wherein the position estimation module renders 3D structures from the new ROV position estimation.
14. The system of claim 13, wherein, before rendering the 3D structures from the new ROV position estimation, the position estimation module removes outlier position estimations to generate a set of remaining position estimations and determines an updated ROV position estimation comprising a mean of the set of remaining position estimations.
15. The system of claim 1, wherein the efficiency module uses a full registration process for a specific structure (i) when a predetermined number of virtual edge points or real edge points are lost, (ii) after a predetermined number of frames, or (iii) when the specific structure enters the real video for a first time.
16. A method of operating a remotely operated vehicle (ROV) comprising: obtaining 3D bathymetry data using multibeam sonar; storing 3D elements in a database module, the 3D elements representing objects disposed in the ROV's operation environment and comprising the 3D bathymetry data; generating a virtual video of the 3D elements; synchronizing an angle and position of a virtual camera with an angle and position of a video camera mounted to the ROV, wherein the virtual camera defines a field of view for the virtual video; and aligning a virtual video element with a real video element to create hybrid 3D imagery, wherein aligning comprises: projecting visible 3D structures into a corresponding 2D virtual image with a 2D virtual image field-of-view; generating N virtual images; registering the N virtual images with a real frame from the real video; and generating a ROV position estimate based at least in part on the corresponding 2D virtual image and the registered N virtual images.
17. The method of claim 16, further comprising using a full registration process for a specific structure.
18. The method of claim 16, further comprising: mapping virtual edge points to corresponding real edge points; applying an iterative process to the mapping; and generating a ROV position estimation.
19. The method of claim 18, further comprising rendering 3D structures from the ROV position estimation.
20. The method of claim 19, wherein generating the ROV position estimation further comprises: removing outlier position estimations; generating a set of remaining position estimations; and determining an updated ROV position estimation comprising a mean of the set of remaining position estimations.
21. A computer program product comprising instructions which, when the program is executed by a computer, cause the computer to carry out the steps of claim 16.